txn: fix the resolved txn status cache for pessimistic txn #21689
Conversation
@youjiali1995 force-pushed from 62051c8 to 2a20028
/run-all-tests
@@ -1333,6 +1337,12 @@ func (mvcc *MVCCLevelDB) ResolveLock(startKey, endKey []byte, startTS, commitTS
	mvcc.mu.Lock()
	defer mvcc.mu.Unlock()

	if len(startKey) > 0 {
Temporarily solving an mvcc leveldb problem, without which resolve lock will not work for split regions.
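The split-region problem above comes down to the scan not being clamped to the region's key range. A minimal, hypothetical sketch of the idea (the names `lock` and `locksInRange` are illustrative, not the real mvcc-leveldb API): only locks inside [startKey, endKey) are considered, with an empty boundary meaning "unbounded", mirroring how region boundaries are usually expressed.

```go
package main

import (
	"bytes"
	"fmt"
)

// lock is a simplified stand-in for an MVCC lock entry.
type lock struct {
	key     []byte
	startTS uint64
}

// locksInRange keeps only the locks whose keys fall inside [startKey, endKey).
// An empty startKey/endKey means the range is unbounded on that side.
func locksInRange(locks []lock, startKey, endKey []byte) []lock {
	var out []lock
	for _, l := range locks {
		if len(startKey) > 0 && bytes.Compare(l.key, startKey) < 0 {
			continue // before the region's left boundary
		}
		if len(endKey) > 0 && bytes.Compare(l.key, endKey) >= 0 {
			continue // at or past the region's right boundary
		}
		out = append(out, l)
	}
	return out
}

func main() {
	locks := []lock{
		{key: []byte("a"), startTS: 10},
		{key: []byte("m"), startTS: 10},
		{key: []byte("z"), startTS: 10},
	}
	// Only "m" lies inside [b, y).
	for _, l := range locksInRange(locks, []byte("b"), []byte("y")) {
		fmt.Println(string(l.key))
	}
}
```

Without such clamping, resolving locks for one region would touch keys belonging to its split siblings.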
store/tikv/lock_resolver.go
Outdated
// - always cache the check txn status result.
// For prewrite locks, their primary keys should ALWAYS be the correct one.
if l.LockType != kvrpcpb.Op_PessimisticLock && status.ttl == 0 {
	logutil.Logger(bo.ctx).Info("saved resolved txn status",
Can it print a lot?
We could remove this log.
@@ -1833,3 +1835,71 @@ func (s *testPessimisticSuite) TestAmendForUniqueIndex(c *C) {
	tk.MustExec("commit")
	tk2.MustExec("admin check table t")
}

func (s *testPessimisticSuite) TestResolveStalePessimisticPrimaryLock(c *C) { |
It's complex. I want to confirm: does it fail with the always-cached version?
I've tested without the cache change, and it fails with an error reported from the admin check statement. It can be treated as a reproducing case in unit tests; I think we also need to add cases in ticase.
store/tikv/lock_resolver.go
Outdated
zap.Stringer("status action", status.action),
zap.Uint64("status ttl", status.ttl),
zap.Uint64("status commitTS", status.commitTS))
Seems it never changes.
We could remove this log.
store/tikv/lock_resolver.go
Outdated
// If l.lockType is prewrite lock type:
// - always cache the check txn status result.
// For prewrite locks, their primary keys should ALWAYS be the correct one.
if l.LockType != kvrpcpb.Op_PessimisticLock && status.ttl == 0 {
It seems that in the current code, we don't cache when the primary lock is of prewrite lock type. Can we move this logic to where saveResolved originally was (L579)? We have more information there, and we can even cache it when the primary lock is committed with a positive ts.
If we want to cache all results, we need to change kvproto.

> And we can even cache it when the primary lock is committed with a positive ts.

It seldom happens...
We cannot get the lock type for the lock we met in the original place, and it is needed for the cache check. Could we add back the saving logic for committed transactions in the original place?
Okay, I think it's optional, as @youjiali1995 says this case seldom happens.
@youjiali1995 @sticnarf
In the previous diff the resolved cache would be checked every time, which is not necessary. I have put the cache-fill logic back in its original place. PTAL again, thanks.
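The guard being discussed can be sketched in isolation. This is an illustrative stand-in, not the real `LockResolver` (the type names and `saveResolvedIfFinal` helper are invented here): a status is cached only when the checked lock is a prewrite lock, whose primary is always correct, and the transaction is already finished (ttl == 0), so the cached status can never change later.

```go
package main

import "fmt"

// txnStatus is a simplified stand-in for the check_txn_status result.
type txnStatus struct {
	ttl      uint64
	commitTS uint64
}

type lockResolver struct {
	resolved map[uint64]txnStatus // txnID -> final status
}

const opPessimisticLock = "PessimisticLock"

// saveResolvedIfFinal caches the status only when it is safe to do so:
// the lock is not a pessimistic lock (so its primary key is trustworthy)
// and the transaction is finished (ttl == 0). It reports whether it cached.
func (lr *lockResolver) saveResolvedIfFinal(txnID uint64, lockType string, status txnStatus) bool {
	if lockType != opPessimisticLock && status.ttl == 0 {
		lr.resolved[txnID] = status
		return true
	}
	return false
}

func main() {
	lr := &lockResolver{resolved: make(map[uint64]txnStatus)}
	fmt.Println(lr.saveResolvedIfFinal(1, "Put", txnStatus{ttl: 0}))             // finished prewrite txn: cached
	fmt.Println(lr.saveResolvedIfFinal(2, opPessimisticLock, txnStatus{ttl: 0})) // pessimistic primary: skipped
	fmt.Println(lr.saveResolvedIfFinal(3, "Put", txnStatus{ttl: 100}))           // still alive: skipped
}
```

Skipping the cache for pessimistic locks is the point of this PR: a pessimistic txn's primary can change, so a stale cached rollback status could corrupt the resolution.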
@@ -652,6 +652,10 @@ func (c *twoPhaseCommitter) doActionOnGroupMutations(bo *Backoffer, action twoPh
	// by test suites.
	secondaryBo := NewBackofferWithVars(context.Background(), CommitMaxBackoff, c.txn.vars)
	go func() {
		failpoint.Inject("skipCommitSecondaryKeys", func() {
			failpoint.Return()
		})
How about making this failpoint more general (e.g. like the following), so that we can either disable it with return("skip") or delay it with sleep(1000)?

failpoint.Inject("beforeCommitSecondaries", func(v failpoint.Value) {
	if s, ok := v.(string); ok && s == "skip" {
		failpoint.Return()
	}
})
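The callback above only has to interpret values produced by a return(...) enable expression; a sleep(...) action is handled by the failpoint framework before the callback runs. A small stdlib-only sketch of that dispatch (the `interpretFailpointValue` helper is invented here for illustration):

```go
package main

import "fmt"

// interpretFailpointValue mimics the callback's decision: only the string
// value "skip" asks the caller to return early; any other value (or no
// string at all) lets the commit of secondary keys proceed.
func interpretFailpointValue(v interface{}) (skip bool) {
	s, ok := v.(string)
	return ok && s == "skip"
}

func main() {
	fmt.Println(interpretFailpointValue("skip"))  // disable the commit path
	fmt.Println(interpretFailpointValue("other")) // unknown value: proceed
	fmt.Println(interpretFailpointValue(nil))     // no value: proceed
}
```

Keeping the "unknown value" case a no-op means new enable expressions cannot accidentally skip the commit.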
/run-all-tests
store/tikv/lock_resolver.go
Outdated
@@ -229,6 +229,9 @@ func (lr *LockResolver) BatchResolveLocks(bo *Backoffer, locks []*Lock, loc Regi
	if err != nil {
		return false, err
	}
	if l.LockType != kvrpcpb.Op_PessimisticLock && status.ttl == 0 {
		lr.saveResolved(l.TxnID, status)
Could you combine TxnID and the primary's LockForUpdateTS together as the key of LockResolver.resolved as a cache? I'm worried about the performance impact of removing the cache.
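The suggested composite key can be sketched with a plain struct-keyed map. This is an assumption-level illustration of the reviewer's idea, not code from the PR (`resolvedKey` and the field names are invented): a pessimistic txn whose primary was re-locked with a newer for-update ts would miss the stale entry instead of reusing it.

```go
package main

import "fmt"

// resolvedKey combines the transaction's start ts with the primary lock's
// LockForUpdateTS, so a cache entry is only valid for one incarnation of
// the pessimistic primary lock.
type resolvedKey struct {
	txnID       uint64
	forUpdateTS uint64
}

type txnStatus struct{ commitTS uint64 }

func main() {
	cache := map[resolvedKey]txnStatus{}
	cache[resolvedKey{txnID: 7, forUpdateTS: 100}] = txnStatus{commitTS: 0}

	_, hit := cache[resolvedKey{txnID: 7, forUpdateTS: 100}]
	fmt.Println(hit) // same txn, same for-update ts: hit

	_, hit = cache[resolvedKey{txnID: 7, forUpdateTS: 200}]
	fmt.Println(hit) // primary re-locked with a newer ts: miss
}
```

Go structs with only comparable fields can be map keys directly, so no manual key encoding is needed.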
The impact now is that resolving pessimistic locks whose primary lock is of prewrite lock type will need more check_txn_status calls; we could make the returned status certain or uncertain in the future to lower the risk.
lgtm
/run-unit-test
LGTM
lgtm
LGTM
LGTM
/merge
/run-all-tests
lgtm